Inductive Learning in Less Than One Sequential Data Scan
Abstract
Most recent research on scalable inductive learning over very large datasets, decision tree construction in particular, focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art decision tree construction algorithms still require multiple scans over the dataset and rely on sophisticated control mechanisms and data structures. We first discuss a general inductive learning framework that scans the dataset exactly once. We then propose an extension based on Hoeffding's inequality that scans the dataset less than once. Our frameworks are applicable to a wide range of inductive learners.
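The abstract does not spell out how Hoeffding's inequality permits stopping before the scan completes. The sketch below is not the authors' framework; it only illustrates the general idea under stated assumptions: observations are bounded in a known range, they arrive in random order, and the hypothetical helpers `hoeffding_epsilon` and `scan_until_confident` (names introduced here for illustration) stop reading as soon as the running mean is separated from a decision threshold by more than the Hoeffding bound.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """One-sided Hoeffding bound: with probability at least 1 - delta, the
    mean of n independent samples bounded in an interval of width value_range
    deviates from the true mean by less than this amount."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def scan_until_confident(stream, threshold, delta=0.05, value_range=1.0):
    """Read values from `stream` and stop once the running mean differs from
    `threshold` by more than the Hoeffding bound.

    Returns (running_mean, rows_read, decided). Illustrative only: checking
    the bound after every row reuses the same delta at each check, so a
    rigorous version would have to correct for repeated testing.
    """
    total = 0.0
    n = 0
    for x in stream:
        total += x
        n += 1
        mean = total / n
        eps = hoeffding_epsilon(value_range, delta, n)
        if abs(mean - threshold) > eps:
            return mean, n, True  # confident decision before the scan ends
    # Stream exhausted without a confident decision (a full scan).
    return (total / n if n else float("nan")), n, False


if __name__ == "__main__":
    # Hypothetical data: 1000 rows that all vote 1.0, threshold 0.5.
    mean, rows_read, decided = scan_until_confident([1.0] * 1000, 0.5)
    print(mean, rows_read, decided)
```

When the evidence is one-sided, the loop returns after reading only a small prefix of the data, which is the sense in which a scan can take "less than once". The closer the true mean lies to the threshold, the more rows are needed before the bound separates them.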
Similar resources
Sequential Inductive Learning
In this paper I advocate a new model for inductive learning. Called sequential induction, this model bridges classical fixed-sample learning techniques (which are efficient but ad hoc), and worst-case approaches (which provide strong statistical guarantees but are too inefficient for practical use). According to the sequential inductive model, learning is a sequence of decisions which are infor...
Scaling Up Inductive Learning with Massive Parallelism
Machine learning programs need to scale up to very large data sets for several reasons, including increasing accuracy and discovering infrequent special cases. Current inductive learners perform well with hundreds or thousands of training examples, but in some cases, up to a million or more examples may be necessary to learn important special cases with confidence. These tasks are infeasible for...
The Task Rehearsal Method of Sequential Learning
An hypothesis of functional transfer of task knowledge is presented that requires the development of a measure of task relatedness and a method of sequential learning. The task rehearsal method (TRM) is introduced to address the issues of sequential learning, namely retention and transfer of knowledge. TRM is a knowledge based inductive learning system that uses functional domain knowledge as a...
A dynamic model of reasoning and memory.
Previous models of category-based induction have neglected how the process of induction unfolds over time. We conceive of induction as a dynamic process and provide the first fine-grained examination of the distribution of response times observed in inductive reasoning. We used these data to develop and empirically test the first major quantitative modeling scheme that simultaneously accounts f...
Inductive Logic Programming Used to Discover Topological Constraints in Protein Structures
This paper describes the application of the Inductive Logic Programming (ILP) program GOLEM to the discovery of constraints in the packing of beta-sheets in alpha/beta proteins. These constraints (rules) have a role in understanding the protein folding problem. Constraints were learnt for four features of beta-sheet packing: the winding direction of two sequential strands, whether two consecuti...
Publication date: 2003